A Detailed Study on Indian Languages Text Mining
Abstract
India is a country with a huge population of over 127 crore (about 1.27 billion) people, who speak many different languages. Only 5% of the Indian population can communicate effectively in English; the remaining 95% are more comfortable with their regional languages. India is certainly one of the multilingual nations in the world.
Similar resources
A Study and Comparative Analysis of Different Stemmer and Character Recognition Algorithms for Indian Gujarati Script
A lot of work has been reported on optical character recognition for various non-Indian scripts, such as Chinese, English and Japanese, and for Indian scripts such as Tamil, Hindi, Telugu, etc. In this paper, we present a literature review of stemming, optical character recognition (OCR) and text mining work on Indian scripts, mainly for the Gujarati language. We have discussed the different techniques for...
A Survey on text categorization of Indian and non-Indian languages using supervised learning techniques
Text categorization plays an important role in the field of text mining. It is the process of assigning documents to predefined categories. Automatic text categorization is an important task because of the large volume of electronic documents. This paper presents a survey of text categorization for Indian and non-Indian languages. Very little work has been done in text cat...
Document clustering based on ontology and a fuzzy approach
Data mining, also known as knowledge discovery in databases, is the process of discovering previously unknown knowledge from a large amount of data. Text mining applies data mining techniques to extract knowledge from unstructured text. Text clustering, one of the important techniques of text mining, is the unsupervised classification of similar documents into different groups. The most important step...
A Comprehensive Analyze of Stemming Algorithms for Indian and Non-indian Languages
Stemming is a technique for reducing inflected words to their stem or root form. It applies to both suffixes and prefixes. Stemming is a preprocessing step in text mining applications and is commonly used in Natural Language Processing (NLP). A stemmer can reduce morphologically similar words to a root word without performing morphological analysis of th...
Overview of Stemming Algorithms for Indian and Non-Indian Languages
Stemming is a pre-processing step in text mining applications as well as a very common requirement of Natural Language Processing functions. Stemming is the process of reducing inflected words to their stem. Its main purpose is to reduce the different grammatical forms of a word (noun, adjective, verb, adverb, etc.) to its root form. Stemming is widely used in Inform...